Written Report
Abstract
This project draws on several data files about national parks in the United States, which I used to create a Shiny app that provides a very basic education about national parks. To build the map, I used a shape file from the National Park Service to plot all the areas that it manages. This includes not only the areas with the National Park title but also every other designation of national park site (which is quite an extensive list). To add to the map, I scraped the National Parks Wikipedia page for extra information about the true national parks, so users can see more detail when they click on a chosen park. In addition to the map, there are data tables that list relevant information for the national parks and the other areas featured on the map. I also used data provided by the National Park Service on the number of visitors to each national park over a range of years to create two graphs displaying visitor trends.
Introduction
This app uses data and visualizations to help users expand their knowledge of our national parks and to give them a basic education about these areas.
The map and data tables use the following information:
A shape file provided by the National Park Service. The variables I used from this file are:
- UNIT_NAME, The name of the park, site, or area that will appear on the map. This is what shows when the user hovers over any of the blue areas and is also used as the name for the data table “All_Parks”.
- UNIT_TYPE, The classification of each area that the park service manages. This is displayed on the data table “All_Parks” and the user can view areas by type classification.
- STATE, The state each area is in. This is displayed on the “All_Parks” data table and the user can view areas by state.
- METADATA, The link to the National Park Service page about each area. It doesn’t actually provide much information but is displayed on the “All_Parks” data table.
To designate the areas with the “National Park” title and separate them for plotting on the map and the data tables, I used the data that I scraped from Wikipedia and joined them by name. The relevant variables from Wikipedia are:
- Name, The name of the National Park. This is used in the map when the user clicks on green areas and is also displayed in the data table “Just_National_Parks”.
- State, The state the park is located in. Appears in “Just_National_Parks” table and the user can view parks by state designation.
- Established, Gives the date the park was established. This appears in the popup window when a park is clicked on and can also be used in the table “Just_National_Parks” to sort by oldest or newest parks.
- Area (in acres), Gives the size of the parks in acres. This can be used to sort the parks in the “Just_National_Parks” table by size.
- Visitors (in 2022), Shows the number of visitors to each park in 2022. This appears in the popup window and can also be used to sort “Just_National_Parks” by number of visitors.
- Description, A brief description of each park. This appears in the popup window and the data table “Just_National_Parks”.
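The name-based join described above can be sketched roughly as follows. This is an illustrative Python sketch, not the app's actual code: the helper names `normalize` and `flag_national_parks`, the `is_national_park` field, and the sample rows are all hypothetical, and it assumes the names in the two sources match once whitespace and letter case are ignored.

```python
def normalize(name):
    """Collapse whitespace and case-fold so trivially different spellings still match."""
    return " ".join(name.split()).casefold()


def flag_national_parks(areas, wiki_parks):
    """Left-join shape-file records to Wikipedia records on the park name.

    Every area is kept. Matched areas gain the Wikipedia fields and
    is_national_park=True; unmatched areas get is_national_park=False.
    """
    wiki_by_name = {normalize(p["Name"]): p for p in wiki_parks}
    joined = []
    for area in areas:
        match = wiki_by_name.get(normalize(area["UNIT_NAME"]))
        row = dict(area)  # copy so the input records are not modified
        row["is_national_park"] = match is not None
        if match is not None:
            row.update({k: v for k, v in match.items() if k != "Name"})
        joined.append(row)
    return joined


# Hypothetical sample rows, not taken from the real files.
areas = [
    {"UNIT_NAME": "Example Park", "UNIT_TYPE": "National Park", "STATE": "WY"},
    {"UNIT_NAME": "Example Monument", "UNIT_TYPE": "National Monument", "STATE": "AZ"},
]
wiki_parks = [
    {"Name": "example park ", "Established": "1900-01-01", "Description": "Placeholder."},
]

joined = flag_national_parks(areas, wiki_parks)
green = [r["UNIT_NAME"] for r in joined if r["is_national_park"]]      # popup areas
blue = [r["UNIT_NAME"] for r in joined if not r["is_national_park"]]   # hover-only areas
```

Keeping unmatched areas in the result (a left join) is what lets the same table drive both the green and the blue layers of the map.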
The map boosts user interaction with information about national parks and creates an interesting visualization of these areas. It displays the size and geographic location of each area to help users get a better idea of where the areas are and how much of the country is covered by parks and other national sites. The data tables gather this information in one place so that users can ask questions about the areas shown on the map. This is where the majority of questions and education about national parks takes place. The information from the Wikipedia page lets users explore details such as the oldest or newest park and the biggest or smallest park, along with many more investigative questions.
The visitors plots use the following information:
An Excel file published by the National Park Service containing the visitor count for 400 national park sites from 1906 to 2016. The variables I used are:
- Unit Name, The name of the park or area
- Unit Type, The classification type of the area
- Visitors, A visitor count for each year along with a collective total amount
- YearRaw, Each year that the data was collected and a total amount for all years (labeled “Total”)
Using this data, I created two interactive graphs that let the user view visitor counts over a range of years, either for a collection of parks selected by the user or for one specific park in more depth. Adding this information lets the user learn about visitor trends and how highly trafficked some of these areas are. Although this data set provides visitation numbers for 400 of the areas the National Park Service manages, I filtered it down to just the areas with the “National Park” classification to simplify the options and visuals. I also narrowed the year range from the full span of the data to 1995 through 2016 in order to provide more relevant information and reduce clutter in the visuals. Questions that can be posed using this data include the most or least popular year for a chosen park, which park in a group was most visited in a chosen year or years, and general trends in visitor numbers.
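The filtering steps described above might look something like the following. This is a minimal Python sketch rather than the app's own code: the function names `filter_visits` and `busiest_year` and the sample rows are hypothetical, and it assumes each row carries the four variables listed earlier, with summary rows marked by “Total” in YearRaw.

```python
def filter_visits(rows, unit_type="National Park", first_year=1995, last_year=2016):
    """Keep only yearly rows for the given unit type inside the year range."""
    kept = []
    for row in rows:
        if row["Unit Type"] != unit_type:
            continue
        year_raw = str(row["YearRaw"]).strip()
        if not year_raw.isdigit():  # skips the "Total" summary rows
            continue
        year = int(year_raw)
        if first_year <= year <= last_year:
            kept.append({
                "Unit Name": row["Unit Name"],
                "Year": year,
                "Visitors": int(row["Visitors"]),
            })
    return kept


def busiest_year(visits, park):
    """Return the year with the most visitors for one park, or None if absent."""
    park_rows = [v for v in visits if v["Unit Name"] == park]
    if not park_rows:
        return None
    return max(park_rows, key=lambda v: v["Visitors"])["Year"]


# Hypothetical sample rows, not real visitation figures.
rows = [
    {"Unit Name": "Example Park", "Unit Type": "National Park", "Visitors": 100, "YearRaw": "1990"},
    {"Unit Name": "Example Park", "Unit Type": "National Park", "Visitors": 250, "YearRaw": "1995"},
    {"Unit Name": "Example Park", "Unit Type": "National Park", "Visitors": 400, "YearRaw": "2010"},
    {"Unit Name": "Example Park", "Unit Type": "National Park", "Visitors": 750, "YearRaw": "Total"},
    {"Unit Name": "Example Monument", "Unit Type": "National Monument", "Visitors": 999, "YearRaw": "2010"},
]

visits = filter_visits(rows)
top_year = busiest_year(visits, "Example Park")
```

Dropping the “Total” rows before plotting matters because they would otherwise appear as one very large extra data point per park.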
Visualizations
Map with popup
This is the map that I created using the national parks shape file joined with the information scraped from the Wikipedia page. It is color coded to distinguish the areas labeled as National Park, which have more information (green), from the other areas managed by the park service (blue). The green areas provide a popup with the additional information, while the blue areas only show the name of the area when hovered over. However, given the large size of the shape file, I can only publish a visualization with one of the data sets. The published version therefore shows only the green areas, which are the ones with the title of National Park.